The Derivation of Orientation-Invariant Shape Representation in Visual Object Recognition
Abstract
While previous studies suggest that the recognition of misoriented objects may be either orientation-dependent or orientation-invariant, the functional independence of the mechanisms underlying these contrasting patterns of data remains unclear. In particular, it has been widely reported that orientation-invariant performance only emerges following spatial normalisation of misoriented objects on early blocks of trials, suggesting that the computation of orientation-invariant shape representations is not functionally independent of orientation-dependent processes. This issue is examined in two experiments contrasting recognition latencies, across blocks of trials, for symmetrical and asymmetrical 2D novel forms that have previously been shown to elicit orientation-invariant performance in shape recognition tasks. The results show that orientation-invariance, with both stimulus types, does not depend on spatial normalisation of misoriented stimuli on early blocks of trials. In contrast to some recent claims, this finding shows that there are some kinds of orientation-invariant shape representations that can be computed independently of spatial normalisation mechanisms.

Orientation-invariant object recognition

One fundamental issue in research on human visual object recognition concerns how we are able to identify objects across changes in stimulus orientation. The primary source of evidence about the cognitive mechanisms underlying this ability comes from studies of misoriented object recognition (e.g., Arguin & Leek, 1995; Arguin & Leek, in press; Biederman & Gerhardstein, 1993; Jolicoeur, 1985; Jolicoeur, 1990; Jolicoeur & Milliken, 1989; Jolicoeur & Humphrey, 1998; Lawson, 1999; Leek, 1998a; 1998b; Maki, 1986; McMullen & Farah, 1991; Murray, 1999; Rock, 1973; Takano, 1989; Tarr, Williams, Haywood & Gauthier, 1998; Tarr & Pinker, 1989; 1990; Tarr & Bülthoff, 1998). Numerous studies have shown that the time taken to recognise objects can be sensitive to stimulus orientation.
Typically, response times (RTs) in shape recognition tasks increase as a function of the angular disparity between stimulus orientation and an object’s ‘upright’ or most familiar orientation(s) (e.g., Jolicoeur, 1985; Jolicoeur & Milliken, 1989; Jolicoeur & Humphrey, 1998; Lawson, 1999; Leek, 1998b; Maki, 1986; McMullen & Farah, 1991; Murray, 1999; Rock, 1973; Tarr, 1995; Tarr & Pinker, 1989; 1990; Tarr & Bülthoff, 1998). This finding suggests that, at least under certain circumstances, the recognition of misoriented objects involves some form of spatial normalisation. Several hypotheses about the nature of this normalisation process have been advanced. These include accounts based on analogue ‘mental rotation’ (e.g., Jolicoeur, 1985; 1990; Rock, 1973; Shepard & Metzler, 1971; Tarr & Pinker, 1989) [1], view interpolation (e.g., Edelman & Bülthoff, 1992), the response properties (i.e., tuning functions) of neural population vectors (e.g., Perrett, Oram & Ashbridge, 1998), and the incremental spread of activation over networks of orientation (or viewpoint)-dependent shape descriptions (e.g., Edelman & Weinshall, 1991; 1998).

[1] Some recent evidence has challenged this proposal, including data showing that normalisation rates are sometimes non-linear, which is not clearly predicted by analogue mental rotation (e.g., Lawson & Jolicoeur, 1999; see also Lawson, 1999), case studies of neurological patients (e.g., Farah & Hammond, 1988; Turnbull & McCarthy, 1996), and functional brain imaging (e.g., Gauthier et al., 2002) reporting dissociations between mental rotation and misoriented object recognition. For this reason, in this paper we refer only to the more neutral term spatial normalisation, rather than mental rotation, when referring to the mechanisms underlying orientation effects.

In contrast, there is also evidence that visual recognition can sometimes be orientation-invariant: that is, some objects can be identified equally quickly regardless of stimulus orientation (Biederman & Bar, 1999; Biederman & Gerhardstein, 1993; Cohen & Kubovy, 1993; Corballis, Zbrodoff, Shetzer & Butler, 1978; Corballis & Nagourney, 1978; DeCaro & Reeves, 2002; Eley, 1982; Farah, Rochlin & Kline, 1994; Leek, 1998a; McKone & Grenfell, 1999; McMullen & Farah, 1991; Murray, Jolicoeur, McMullen & Ingleton, 1993; Takano, 1989; Tarr & Pinker, 1990; Wiser, 1981). For example, in Tarr and Pinker (1990) participants memorised sets of two-dimensional novel shapes at a single image plane orientation. Recognition memory was later tested by presenting the memorised stimuli for identification, along with visually similar distracters, at the practiced orientation as well as at several unfamiliar ‘test’ orientations. The results showed an orientation-dependent (i.e., monotonically increasing) pattern of RTs for only one of the four sets of stimuli. To account for these results, Tarr and Pinker (1990) argued that stimuli showing orientation-invariant performance could each be uniquely identified in terms of a one-dimensional object-centred shape description; that is, a representation that specifies the relative ordering of shape features along a single internal shape axis. In contrast, stimuli showing orientation-dependent performance could only be discriminated from each other through the specification of their feature configuration along two dimensions (that is, in terms of x and y co-ordinates).

Other studies have suggested that orientation-invariance might also be achieved through the identification of local, so-called ‘free floating’ (Jolicoeur, 1990) or ‘orientation-free’ (Takano, 1989), features that serve to define object identity regardless of global shape orientation (e.g., Biederman & Bar, 1999; Jolicoeur, 1990; Jolicoeur & Milliken, 1989; Jolicoeur & Humphrey, 1998; Just & Carpenter, 1985; Murray et al., 1993; Takano, 1989). These features might include invariant properties of edges, vertices, and their local configuration (e.g., Biederman & Bar, 1999; Takano, 1989; Thacker, Riocreux & Yates, 1994) [2]. However, the exact conditions under which recognition is orientation-dependent, or orientation-invariant, seem to depend on a variety of stimulus and task variables (e.g., Dickerson & Humphreys, 1999; Hamm & McMullen, 1998; Jolicoeur & Milliken, 1989; Leek et al., 1998a; Vanrie, Willems & Wagemans, 2001).

One unresolved issue is whether orientation-invariance can be mediated by processes that are functionally distinct from those involved in orientation-dependent spatial normalisation. It is clear that orientation-invariant performance, under some conditions, can be accounted for within the context of orientation-dependent theories of shape recognition (e.g., Tarr & Pinker, 1989; Tarr, 1990; Tarr & Bülthoff, 1998). For example, one well-documented finding is that orientation effects often diminish with practice, resulting in orientation-invariant performance on later blocks of trials (e.g., Jolicoeur, 1985; Jolicoeur & Humphrey, 1998; Lawson, 1999; Maki, 1986; Murray, 1999; Tarr & Pinker, 1989).
This practice effect is consistent with the encoding of orientation- (or viewpoint-) specific representations across a range of stimulus orientations (e.g., Lawson & Jolicoeur, 1999; Leek, 1998a; 1998b; Tarr, 1995; Tarr & Pinker, 1989; Tarr & Bülthoff, 1998): if a sufficiently large number of different orientation-specific representations are encoded, recognition latencies may appear to be invariant to changes in stimulus orientation because the time taken to spatially normalise misoriented objects to the nearest stored shape description will be relatively constant or negligible.

[2] It is also necessary to specify, on this account (although this is rarely done), how the individual local features, or combinations of features, are identified independently of their orientation. One possibility is that they are encoded within feature-specific ‘object-centred’ coordinate systems relative to a reference axis (e.g., Thacker et al., 1994).

The hypothesis that the derivation of orientation-invariant shape representations is dependent on spatial normalisation mechanisms is also supported by findings from two other studies. Jolicoeur and Milliken (1989) examined the effects of naming upright, and rotated, objects on recognition latencies for the same and new objects in subsequent blocks. They found that effects of stimulus orientation diminished only when the same objects had been presented in non-upright orientations, or in the context of other misoriented stimuli, in earlier blocks of trials. Misoriented objects that had previously only been presented at upright orientations continued to undergo spatial normalisation on subsequent presentations. Additional supporting evidence has also been found by McMullen and Farah (1991). They tested the generality of the findings of Tarr and Pinker (1990) in a task involving the recognition of line drawings of common objects. The results showed that while orientation effects for common objects with an axis of symmetry in the image plane do diminish more quickly than those for objects with no axis of symmetry, they still show orientation-dependent performance on early blocks of trials.

The data from these studies suggest that spatial normalisation, on initial presentations of objects, may be a prerequisite to the derivation of orientation-invariant shape descriptions. As such, they undermine the hypothesis that orientation-invariance may be mediated by processes that are functionally distinct from those involved in orientation-dependent spatial normalisation: if this were the case, it is not obvious why spatial normalisation should be required during the derivation of the orientation-invariant representations.

In this context, the generality of block, and context, effects (that is, orientation-invariant performance that is seemingly dependent either on spatial normalisation on early blocks of trials or on the prior presentation of objects in the context of misoriented forms) is of considerable theoretical interest. While the decline in orientation effects over blocks of trials has been reported in several studies (e.g., Jolicoeur, 1985; McMullen & Jolicoeur, 1992; Murray, 1999; Tarr & Pinker, 1989), the generality of block effects in the identification of other classes of object shapes that have been shown to elicit orientation-invariant performance has yet to be examined. In particular, both the studies of Jolicoeur and Milliken (1989), and of McMullen and Farah (1991), used only line drawings of common objects which, for the most part, are relatively rich in terms of their feature structure. Thus, the observation of block, and context, effects in those studies might reflect the operation of processes related to the identification of orientation-invariant shape features, rather than more generally to the encoding of other forms of orientation-invariant representation such as the axis-based global shape descriptions proposed by Tarr and Pinker (1989). If block effects do not generalise to other classes of stimuli, this would support the hypothesis that there are some types of orientation-invariant mechanisms that are functionally distinct from those involved in orientation-dependent spatial normalisation.

We examined this issue in the current study. The goal of these experiments was to determine whether spatial normalisation also occurs on early blocks of trials with object shapes which, by hypothesis, can be encoded in orientation-invariant axis-based global shape representations. This was done using the same stimulus sets, and a modified version of the recognition-memory paradigm, reported in Tarr and Pinker (1990). During an initial learning phase, subjects were trained to recognise a target stimulus from a specific class of novel shapes. Objects were presented only at a single ‘upright’ orientation in the learning phase to avoid context effects influencing performance on subsequent blocks (Jolicoeur & Milliken, 1989). Recognition memory for the target shapes was later tested by presenting the target, and distracter, stimuli at the practiced orientation, as well as at several test orientations. Recognition latencies were then analysed, across blocks, to determine the time course involved in the derivation of object-centred representations.
If spatial normalisation is indeed a general prerequisite for the derivation of orientation-invariant shape representations, then orientation-dependent performance should be found on early blocks of trials with both object sets. In contrast, if such representations can be computed using functionally independent mechanisms that do not involve spatial normalisation, then orientation-invariant performance should be found even on the initial block of test trials.
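One way to make these two predictions concrete is to summarise each block of test trials by the slope of RT against angular disparity from the trained orientation: a reliably positive slope on the first block would indicate orientation-dependent performance, while a slope near zero would indicate orientation-invariance. The following is a minimal sketch in Python under that assumption. It is not the analysis reported in the study; the function names, the choice of 0 degrees as the trained orientation, and all RT values are hypothetical and made up purely for illustration.

```python
from statistics import mean


def angular_disparity(orientation_deg, upright_deg=0.0):
    """Smallest angle (0-180 deg) between a stimulus orientation and upright."""
    diff = abs(orientation_deg - upright_deg) % 360.0
    return min(diff, 360.0 - diff)


def orientation_slope(trials, upright_deg=0.0):
    """Least-squares slope of RT (ms) on angular disparity (deg) for one block.

    `trials` is a list of (orientation_deg, rt_ms) pairs.
    """
    xs = [angular_disparity(o, upright_deg) for o, _ in trials]
    ys = [rt for _, rt in trials]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct angular disparities")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical first-block data (made-up values, not results from the study).
# Orientation-dependent pattern: RT grows with distance from upright.
dependent_block = [(0, 600), (60, 720), (120, 840),
                   (180, 960), (240, 840), (300, 720)]
# Orientation-invariant pattern: RT is flat across orientations.
invariant_block = [(0, 650), (60, 650), (120, 650),
                   (180, 650), (240, 650), (300, 650)]

print(orientation_slope(dependent_block))  # 2.0 (ms per degree)
print(orientation_slope(invariant_block))  # 0.0
```

Angular disparity is folded into the 0 to 180 degree range so that, for example, 240 degrees and 120 degrees are treated as equally far from upright. A single linear slope is a simplification, since normalisation functions are sometimes reported to be non-linear.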
Similar resources
Orientation Sensitivity at Different Stages of Object Processing: Evidence from Repetition Priming and Naming
BACKGROUND An ongoing debate in the object recognition literature centers on whether the shape representations used in recognition are coded in an orientation-dependent or orientation-invariant manner. In this study, we asked whether the nature of the object representation (orientation-dependent vs orientation-invariant) depends on the information-processing stages tapped by the task. METHODO...
Complementary Solutions to the Binding Problem in Vision: Implications for Shape Perception and Object Recognition
Behavioral, neural and computational considerations suggest that the visual system may use (at least) two approaches to binding an object's features and/or parts into a coherent representation of shape: Dynamically bound (e.g., by synchrony of firing) representations of part attributes and spatial relations form a structural description of an object's shape, while units representing shape attri...
A comparison of the effects of depth rotation on visual and haptic three-dimensional object recognition.
A sequential matching task was used to compare how the difficulty of shape discrimination influences the achievement of object constancy for depth rotations across haptic and visual object recognition. Stimuli were nameable, 3-dimensional plastic models of familiar objects (e.g., bed, chair) and morphs midway between these endpoint shapes (e.g., a bed-chair morph). The 2 objects presented on a ...
Invariant Shape Matching for Detection of Semi-local Image Structures
Shape features applied to object recognition have been actively studied since the beginning of the field in the 1950s and remain a viable alternative to appearance-based methods, e.g., local descriptors. This work addresses the problem of learning and detecting repeatable shape structures in images that may be incomplete, contain noise and/or clutter, and vary in scale and orientation. A new appro...
Shape defect detection in ferrite cores
In the framework of a European technological research project [2], a general method is presented for shape measurement and defect detection of industrially produced objects using the characteristic 2D projections of the objects. The method is applied to the visual inspection and dimensional measurement of ferrite cores. An optical shape gauge system is described, based on rotation-invariant shap...